PADS: Processing Arbitrary Data Streams
نویسندگان
چکیده
Often such streams are high-volume: AT&T’s call-detail stream contains roughly 300 million calls per day requiring approximately 7GBs of storage space. Typically, such stream data arrives “as is” in ad hoc formats with poor documentation. In addition, the data frequently contains errors. The appropriate response to such errors is applicationspecific. Some applications can simply discard unexpected or erroneous values and continue processing. For other applications, however, errors in the data can be the most interesting part of the data.
منابع مشابه
Privacy-Preserving Distributed Stream Monitoring
Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...
متن کاملPrivacy-Preserving Distributed Stream Monitoring (NDSS 2014)
Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...
متن کاملDENGRIS-Stream: A Density-Grid based Clustering Algorithm for Evolving Data Streams over Sliding Window
Evolving data streams are ubiquitous. Various clustering algorithms have been developed to extract useful knowledge from evolving data streams in real time. Density-based clustering method has the ability to handle outliers and discover arbitrary shape clusters whereas grid-based clustering has high speed processing time. Sliding window is a widely used model for data stream mining due to its e...
متن کاملFPGA Implementation of JPEG and JPEG2000-Based Dynamic Partial Reconfiguration on SOC for Remote Sensing Satellite On-Board Processing
This paper presents the design procedure and implementation results of a proposed hardware which performs different satellite Image compressions using FPGA Xilinx board. First, the method is described and then VHDL code is written and synthesized by ISE software of Xilinx Company. The results show that it is easy and useful to design, develop and implement the hardware image compressor using ne...
متن کاملAn Adaptive Grid-based Method for Clustering Multi- Dimensional Online Data Streams
Clustering is an important task in mining the evolving data streams. A lot of data streams are high dimensional in nature. Clustering in the high dimensional data space is a complex problem, which is inherently more complex for data streams. Most data stream clustering methods are not capable of dealing with high dimensional data streams; therefore they sacrifice the accuracy of clusters. In or...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003